Graphical User Interface (GUI) Noise Reduction in a Cognitive Control Framework

ABSTRACT

Reducing graphical user interface (GUI) noise may be achieved by recording a first execution scenario for control of operation of an application program having a GUI during a recording phase of operation of a cognitive control framework system; setting soft conditions for a search for the application program for the first execution scenario; playing back the application program according to the first execution scenario during a playback phase of operation of the cognitive control framework system; updating the first execution scenario to form a second execution scenario to reduce GUI noise conditions observed during playback, including updating recorded images originally generated by the GUI during the recording phase and updating coordinates for user input data; setting stronger conditions for the search for use in subsequent playbacks; and playing back the application program according to the second execution scenario with the stronger conditions for search.

BACKGROUND

1. Field

The present invention relates generally to automatic control of software application programs and image analysis and, more specifically, to analyzing graphical user interface (GUI) images displayed by an application program for automatic control of subsequent execution of the application program.

2. Description

Typical application program analysis systems capture keyboard input data and mouse input data entered by a user. The captured input data may then be used to replay the application program. These systems rely on playback of the application program on the same computer system used to capture the input data, and thus are not portable.

Some existing application program analysis systems use image recognition techniques that are dependent on screen resolution and/or drawing schemes, or have strong dependencies on the underlying operating system (OS) being used. Such systems typically rely on dependencies such as Windows32 or X-Windows application programming interfaces (APIs). This limits their portability and usefulness.

Hence, better techniques for analyzing the GUIs of application programs are desired.

BRIEF DESCRIPTION OF THE DRAWINGS

The features and advantages of the present invention will become apparent from the following detailed description of the present invention in which:

FIG. 1 is a diagram of a cognitive control framework system according to an embodiment of the present invention;

FIG. 2 is a flow diagram illustrating processing in a cognitive control framework according to an embodiment of the present invention;

FIG. 3 is an example display of the GUI of an application program captured and saved during a recording phase;

FIG. 4 is an example display of the GUI of an application program captured during a playback phase;

FIG. 5 is an example image illustrating objects identified during contouring operations of the recording phase according to an embodiment of the present invention;

FIG. 6 is an example image illustrating objects of activity of the recording phase according to an embodiment of the present invention;

FIG. 7 is an example image illustrating objects identified during contouring operations of the playback phase according to an embodiment of the present invention;

FIG. 8 is an example image illustrating a hypothesis during the playback phase according to an embodiment of the present invention;

FIG. 9 is an example image illustrating a highlighted object according to an embodiment of the present invention;

FIG. 10 is an example image illustrating an object with no highlighting according to an embodiment of the present invention;

FIG. 11 is an example image illustrating an active object according to an embodiment of the present invention;

FIG. 12 is an example image illustrating an inactive object according to an embodiment of the present invention;

FIG. 13 is an example image illustrating a tool tip in a recorded image according to an embodiment of the present invention;

FIG. 14 is an example image illustrating a tool tip in a playback image according to an embodiment of the present invention;

FIG. 15 is an example image illustrating an index structure in a first version of a GUI according to an embodiment of the present invention;

FIG. 16 is an example image illustrating another index structure in a second version of a GUI according to an embodiment of the present invention;

FIG. 17 is a flow diagram of noise reduction processing according to an embodiment of the present invention;

FIG. 18 is an example image illustrating a recorded image according to an embodiment of the present invention;

FIG. 19 is an example image illustrating a playback image according to an embodiment of the present invention;

FIG. 20 is an example image illustrating another recorded image according to an embodiment of the present invention;

FIG. 21 is an example image illustrating another playback image according to an embodiment of the present invention; and

FIG. 22 is a flow diagram illustrating GUI noise reduction processing according to an embodiment of the present invention.

DETAILED DESCRIPTION

Embodiments of the present invention comprise a cognitive control framework (CCF) for automatic control of software application programs that have a graphical user interface (GUI). Examples of such application programs may be executed on current operating systems such as Microsoft Windows® and Linux, for example, as well as other operating systems. An embodiment of the present invention creates a system simulating a human user interacting with the GUI of the application program and using the GUI for automatic control of the application program without relying on dependencies such as specific graphical libraries, windowing systems, or visual controls interfaces or implementations. The CCF comprises an easy-to-use cross-platform tool useful for GUI testing based on pattern recognition. By being independent of any OS-specific controls and graphical libraries, the CCF may be used for interaction with non-standard graphical interfaces as well as with well known ones. The system provides for recording any kind of keyboard and mouse actions the user performs while working with the GUI of the application program and then providing playback of the recorded scenario. In the present invention, image analysis of captured display data (such as screen shots, for example) is performed to identify actions of the application program corresponding to user input data. These actions and input data may be stored for use in future playback of the same user scenario for automatically interacting with the application program.

Embodiments of the present invention comprise operating in two phases: a recording phase and a playback phase. During the recording phase, the system is “learning” how to control the application program. The system registers and captures input actions supplied by the user (such as a mouse click or entering of text via a keyboard, for example) and display data (e.g. screen shots) of images displayed by the application program in response to those actions. The user actions, the time interval between actions, resulting display data of the GUI of the application program, and possibly other data and/or commands form an execution scenario. By following the execution scenario, during the playback phase the system provides the same but fully automatic execution of the application program (simulating the user control but without the real presence of the user). Automatic execution is made possible by a plurality of image analysis and structural techniques applied correspondingly to images during the recording and playback phases.

FIG. 1 is a diagram of a cognitive control framework (CCF) system 100 according to an embodiment of the present invention. FIG. 1 shows two components, recording component 102 and playback component 104. These components may be implemented in software, firmware, or hardware, or a combination of software, firmware and hardware. In the recording component, the CCF system registers and captures user input activity at block 106. For example, the user may make input choices over time to an application program being executed by a computer system using a mouse, keyboard, or other input device. This input data is captured and stored by the CCF system. Next, at block 108, the display data may be captured (e.g. screen shots are taken). In one embodiment, the display data may be captured only when user input has been received by the application program. The display data is also saved. At block 110, the data captured during blocks 106 and 108 may be analyzed and saved. These processes may be repeated a plurality of times. The result of the processing of the recording component comprises an execution scenario 112 for the application program being processed by the system. In one embodiment, the execution scenario comprises a script containing Extensible Markup Language (XML) tags. The execution scenario describes a sequence of user inputs to the application program, corresponding display images on a GUI of the application program, and commands directing the application program to perform some actions.

At a later point in time, during the playback phase, the playback component 104 may be initiated. At block 114, simulated user activity may be generated based on the execution scenario. That is, saved inputs and commands from the execution scenario may be input to the application program for purposes of automatic control using the CCF system. While the application program processes this data, display data may be changed on the display as a result. At block 116, the CCF system performs image analysis on the playback display data currently being shown as a result of application program processing and the display data captured during the recording phase. At block 118, recorded time conditions may be checked to take into account possible variations in playback. For example, the time when an object appears may be within a time interval based on a recorded time. In one embodiment, a lower bound time (time to start the search) may be extracted from the saved data in the execution scenario and an upper bound time may be the lower bound time plus 10%, or some other appropriate value. Processing at each of blocks 114, 116, and 118 results in data being stored in report 120. At block 119, the CCF system controls execution of the application program based on the results of the image analysis. Blocks 114, 116 and 118 may be repeated for each item in a sequence of user input data items from the execution scenario.

The time interval between sequential actions is a part of the captured execution scenario. However, while following the execution scenario in the playback phase, one should not expect that the time interval between any two actions at playback will be equal to the time interval between the same two actions during the recording phase. There are a number of objective reasons why this interval could be different on playback than during recording. For example, the application program during recording and playback may be executed on different computer systems having different processor speeds, or an application program could require different times for the same actions during playback due to accesses of external data or resources. This indicates a requirement in the CCF system to handle flexible time conditions, e.g. handle some tolerance for the time interval between actions during the playback phase. During that time interval at playback, the system compares the recorded display data to the playback display data several times to determine if the playback display data is substantially similar to the recorded display data. A finding that the two are substantially similar indicates that a previous user action has completed and the system can progress to the next action in the execution scenario. This activity may be similar to the situation where the user is interacting with the application program and pauses periodically to view the display to determine if the expected visible changes to the display have been made by the application program based on previous actions. If so, then a new action may be performed. If at the end of the upper bound of the time interval the application program has not produced an image on the display that the CCF system expected according to the execution scenario, then the CCF system may interrupt the playback of the execution scenario and generate an error report describing how the execution scenario has not been followed. In one embodiment, the scenario may be corrected and the CCF system may be required to use other branches to continue.
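The flexible time condition described above can be sketched as a polling loop. This is a minimal illustration, not the CCF implementation: the function name `wait_for_expected_image`, the `images_match` callback, and the `polls` parameter are assumptions introduced here, and the 10% tolerance is only the example value given in the text.

```python
import time
from typing import Callable


def wait_for_expected_image(
    images_match: Callable[[], bool],
    recorded_delay_s: float,
    tolerance: float = 0.10,
    polls: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll for a match between the lower and upper bound of the time window.

    images_match is a hypothetical callback that captures the current
    playback image and compares it with the recorded image.
    """
    lower = recorded_delay_s                 # time to start the search
    upper = lower * (1.0 + tolerance)        # e.g. lower bound plus 10%
    step = (upper - lower) / max(polls - 1, 1)

    sleep(lower)                             # do not search before the lower bound
    for attempt in range(polls):
        if images_match():
            return True                      # previous action completed
        if attempt < polls - 1:
            sleep(step)                      # wait, then look at the display again
    return False                             # caller may halt and write an error report
```

Passing the `sleep` function as a parameter is only a convenience so that the loop can be exercised without real delays.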

The cognitive control framework (CCF) system of embodiments of the present invention performs image analysis and object detection processing on display data from the GUI of the application program. The CCF system includes comparing an image captured during a recording phase (called IR) to the corresponding image captured during the playback phase (called IP). One task of the system is to detect an object in the IR to which the user applied an action, find the corresponding object in the IP, and continue progress on the execution path of the execution scenario by applying the action to the detected object. These steps may be repeated for multiple objects within an image, and may be repeated across multiple pairs of IRs and IPs over time. An object that the user has applied an action to may be called an “object of action.” Absence in the IP of the object of action corresponding to the one found in the IR means that one should capture the IP again at a later time and try to find the object of action again. Finally, either an object of action may be found in the IP, or execution of the scenario may be halted and a report generated describing how the wrong state was reached and why the scenario cannot be continued. In embodiments of the present invention, this detection of objects of action may be done in real time during the playback phase, progressing from one action to another. Thus, the image analysis process employed must have good performance so as to introduce only a minimal disturbance to the time conditions at playback.

The CCF system of embodiments of the present invention comprises an image analysis and detecting process. Such a process has at least two requirements. First, the process should be able to overcome some variations in the captured images such as different color schemes, fonts, and the layout and state of the visual elements. In one embodiment, comparison constraints for checking these items (color scheme, fonts, etc.) may be set to specified parameters in accordance with specific needs. Overcoming these variations is desirable because recording and playback might be executed in different operating environments such as different screen resolutions, different visual schemes, different window layouts, and so on. Additionally, there could be insignificant differences in corresponding IR (usually captured after an action was applied to an object of interest) and IP pairs (captured after a previous action was completed). Second, the implementation of the image analysis and object detection process should be fast enough to introduce only minimal disturbances and delay of application execution during playback.

By processing captured images, the system builds descriptions of the images in terms of the objects presented on them. Each display object may be represented by its contour and a plurality of properties. Table I enumerates some possible contour properties for use in the present invention. In other embodiments, other properties may also be used.

TABLE I Contour properties

Property        Description
Location        Coordinates (on the image) of the contour center.
Image size      Characteristic contour size. In case of rectangular contours they are just vertical and horizontal sizes. For controls of more complicated shape, another format may be used.
Layout          Connection to other contours that lay in proximity to its boundaries/layout pattern of this contour.
Content Type    Indicates what is inside of the contour: text, image or a combination.
Content         If the content type is text, then a text string; if image (e.g. icon), then the image.
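As an illustration only, the contour properties of Table I could be held in a record such as the following. The class and field names (`Contour`, `ContentType`, `neighbors`, `image_ref`) are hypothetical and are not taken from the described system; a rectangular contour is assumed.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class ContentType(Enum):
    TEXT = "text"
    IMAGE = "image"
    COMBINATION = "combination"


@dataclass
class Contour:
    location: Tuple[int, int]            # coordinates of the contour center
    size: Tuple[int, int]                # horizontal and vertical size (rectangular case)
    content_type: ContentType            # what is inside the contour
    text: Optional[str] = None           # content when the type is text
    image_ref: Optional[str] = None      # hypothetical reference to an icon image
    neighbors: List[int] = field(default_factory=list)  # layout: ids of nearby contours

    def bounding_box(self) -> Tuple[int, int, int, int]:
        """Return (left, top, right, bottom) derived from center and size."""
        cx, cy = self.location
        w, h = self.size
        return (cx - w // 2, cy - h // 2, cx + w // 2, cy + h // 2)
```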

FIG. 2 is a flow diagram illustrating processing of a CCF system according to an embodiment of the present invention. During the recording phase 220 handled by recording component 102, at block 200 the system determines contours of objects in the IR. At block 202, the system detects a current object of activity. At block 204, the system detects additional objects adjacent to the current object of activity in the IR. These steps (200, 202, and 204) may be repeated over time for all objects of activity during execution of the application program in the recording phase.

Next, during the playback phase 222 handled by playback component 104, at block 206 the CCF system determines the contours of objects in the IP. At block 208, the CCF system filters contours by size to determine contours that may become hypotheses for active objects and contours that connect them. At block 210, the CCF system filters the objects by basic space layout in the IP to determine subsets of hypotheses for active and additional objects. For example, filtering criteria for space layout may include tables, wizards, and menus. In one embodiment, the user (or CCF schema with a cascade search) could set both strict (e.g. “as is”) and fuzzy (e.g. “objects could be near each other”) conditions. At block 212, the CCF system filters the objects by content to produce further subsets of hypotheses for active and additional objects. For example, the filtering criteria by content may include images and text. Moreover, in one embodiment, the user (or CCF schema with cascade search) could set both strict (e.g. “image should have difference in a few points and text should have minimal differences on a base of Levenshtein distance”) and fuzzy (e.g. “image could be stable to highlighting and have insignificant structural changes and text could have noticeable differences on a base of Levenshtein distance without consideration of digits”) conditions. At block 214, the CCF system performs structural filtering of the objects to produce a best hypothesis for active objects.
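The strict and fuzzy text conditions mentioned for block 212 can be illustrated with a small sketch. The Levenshtein distance below is the standard dynamic-programming formulation; the function `text_matches` and its numeric limits are assumptions chosen for illustration, since the text does not specify thresholds.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def text_matches(recorded: str, playback: str, fuzzy: bool = False) -> bool:
    """Compare OCR text under a strict or a fuzzy condition.

    The limits are hypothetical: strict allows one edit; fuzzy ignores
    digits and allows edits up to about a third of the recorded length.
    """
    if fuzzy:
        recorded = re.sub(r"\d", "", recorded)
        playback = re.sub(r"\d", "", playback)
        limit = max(1, len(recorded) // 3)
    else:
        limit = 1
    return levenshtein(recorded, playback) <= limit
```

For instance, "Activity 12" and "Activity 345" differ only in digits, so they fail the strict condition of this sketch but pass the fuzzy one.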

Finally, at block 216, the CCF system recalculates old actions for a new object by applying the action according to the execution scenario. For example, suppose the user selected (via the mouse) the screen location at (X=70, Y=200), and that a button is displayed at the rectangle denoted (X1=50, Y1=150, X2=100, Y2=100), where (X1, Y1) is the top left corner and (X2, Y2) is the size of the rectangle. In the IP, the button may be represented as a rectangle denoted (X1=250, Y1=300, X2=200, Y2=100). In the general case, both the coordinates of the top left corner and the size of the rectangle may change. The mouse click (user selection) may be recalculated based on the position of the button and the scaled size (for X and Y coordinates). The click lies 20% of the way across and 50% of the way down the recorded button, so the calculation gives the new mouse click coordinates (e.g., X=290, Y=350).
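The recalculation in this example can be written out as a short sketch. It assumes that each rectangle is given as top left corner plus width and height, which is the reading that reproduces the stated result; the function name `recalculate_click` is hypothetical.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (left, top, width, height)


def recalculate_click(click: Tuple[float, float],
                      recorded: Rect,
                      playback: Rect) -> Tuple[int, int]:
    """Map a click inside the recorded rectangle to the playback rectangle.

    The relative position of the click inside the object is preserved,
    so both translation and scaling of the object are handled.
    """
    rx, ry, rw, rh = recorded
    px, py, pw, ph = playback
    fraction_x = (click[0] - rx) / rw
    fraction_y = (click[1] - ry) / rh
    return round(px + fraction_x * pw), round(py + fraction_y * ph)


# Values from the example in the text.
new_click = recalculate_click((70, 200), (50, 150, 100, 100), (250, 300, 200, 100))
print(new_click)  # (290, 350)
```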

Table II shows the input data and output of the image analysis process for FIG. 2.

TABLE II Image Analysis Processing

Step 1. Contouring
  Input data: Image from recording (IR).
  Result: Contours.
  Input parameters and description: Thresholds, distances between objects (with some tolerance). Intel® OpenCV library used in one embodiment.

Step 2. Detecting object of activity
  Input data: Image IR and contours from previous step.
  Result: Contour representing object of activity.
  Input parameters and description: Typical object size (with tolerance) for object of action. Optical character recognition (OCR) and fuzzy text comparison, e.g. with Levenshtein distance.

Step 3. Detecting additional objects around object of activity
  Input data: Image IR, contours and active objects.
  Result: Additional objects and their layout against object of action.
  Input parameters and description: Typical object size (with tolerance) for additional objects. Structural analysis, e.g. “criss-cross” rules.

Step 4. Contouring
  Input data: Image from playback (IP).
  Result: Contours.
  Input parameters and description: Thresholds, distances between objects (with some tolerance). Intel® OpenCV library used in one embodiment.

Step 5. Filtering by size
  Input data: Contours from previous step.
  Result: Contours that become hypotheses for active object and contours connected with them.
  Input parameters and description: Mean object size (with tolerance) based on active object characteristics evaluated at Step 2. Typical object size (with tolerance) for additional objects. Filtering out contours that don't fit into input size limits.

Step 6. Filtering by basic space layout
  Input data: Subsets of hypotheses for active and additional objects.
  Result: Decreased subsets of hypotheses for active and additional objects.
  Input parameters and description: Fuzzy distance filtration. Fuzzy filtration for directions.

Step 7. Filtering by content
  Input data: Subsets of hypotheses for active and additional objects.
  Result: Decreased subsets of hypotheses for active and additional objects.
  Input parameters and description: OCR and fuzzy text comparison, e.g. with Levenshtein distance. Fuzzy image comparison. Using “fuzzy content type” method for filtration.

Step 8. Structural filtering
  Input data: Subsets of hypotheses for active and additional objects.
  Result: The best hypothesis for active objects.
  Input parameters and description: Method based on fuzzy triple links both between objects from IR and their hypotheses from IP. It's stable to additional objects which don't have strong structural links with active object. Moreover, one can use the result of this method to choose the best hypotheses for active objects. Some other methods, e.g. Hough transformation, may also be used here.

Step 9. Recalculating old actions for new object
  Input data: Object of action.
  Result: Applied the action according to the execution scenario.
  Input parameters and description: Recalculating action coordinates in IP (playback image) coordinate system.

During filtering at each step there is an evaluation of specific contour properties (as required for a specific filter). This filtering pipeline is designed in such a way that the most time consuming evaluation steps are shifted to later in the processing pipeline when the number of contours (hypotheses) is smaller. By using this approach, the overall computational cost may be decreased, thereby helping to ensure good performance of the system.
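The effect of ordering the filters by cost can be shown with a toy sketch. The cost figures, the filter predicates, and the function `run_cascade` are all hypothetical; the point is only that running the cheap filter first reduces how many hypotheses the expensive filter must evaluate.

```python
from typing import Any, Callable, Iterable, List, Tuple

# (name, relative cost per hypothesis, predicate that keeps a hypothesis)
Filter = Tuple[str, float, Callable[[Any], bool]]


def run_cascade(hypotheses: Iterable[Any],
                filters: List[Filter],
                cheapest_first: bool = True) -> Tuple[List[Any], float]:
    """Apply filters in sequence and return survivors plus total evaluation cost."""
    ordered = sorted(filters, key=lambda f: f[1]) if cheapest_first else list(filters)
    survivors = list(hypotheses)
    total_cost = 0.0
    for _name, cost, keep in ordered:
        total_cost += cost * len(survivors)       # every survivor is evaluated once
        survivors = [h for h in survivors if keep(h)]
    return survivors, total_cost


# Toy data: 100 hypotheses, an expensive "content" check and a cheap "size" check.
toy_filters: List[Filter] = [
    ("content", 10.0, lambda h: h % 10 == 0),
    ("size", 1.0, lambda h: h < 20),
]
print(run_cascade(range(100), toy_filters))                        # ([0, 10], 300.0)
print(run_cascade(range(100), toy_filters, cheapest_first=False))  # ([0, 10], 1010.0)
```

Both orders keep the same hypotheses in this toy case; only the amount of work differs.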

It is useful to maintain a compromise in order to make sure that the system does not filter out some contours in the early steps that may be later determined to be either a hypothesis of an object of activity or objects connected with an object of activity. In this regard, predefined input parameters may be set to broad limits. This requires spending a little more time on processing of additional contours (hypotheses), but ensures that the system has not dropped important contours.

Example pseudo-code for one embodiment of the present invention is shown in Table III.

TABLE III Pseudo Code Example

BEGIN CCF
<<<<<<<< Recording >>>>>>>>
LOOP /*recording, e.g. till a special key combination*/
    Wait on user action /*mouse, keyboard, it's possible to set something else*/
    Hook and save screenshot /*e.g. <Screenshot fileName="1.png"/>*/
    Save time interval from the previous action /*e.g. <Sleep duration="2000"/>*/
    Save information about user action /*e.g. <Mouse action="RightClick" x="100" y="200"/>*/
END LOOP /*recording, e.g. till a special key combination*/
EXIT
<<<<<<<< Post-processing >>>>>>>>
Process saved data into a more compact form. It's possible for the user to change it for his or her needs.
<<<<<<<< Playback >>>>>>>>
LOOP /*till the end of saved data*/
    Load time interval and wait in accordance with it.
    IF [actions depend on coordinates on the screen] /*e.g. mouse click*/ THEN
        Load saved screenshot
        Detect object of action /*e.g. button*/, nearest structure-layout /*e.g. menu items around button*/ and other useful info on saved screenshot
        TimeConditions_label:
        Hook the current screenshot
        Use image processing to find the corresponding object on the current screenshot /*it's possible to require more information from saved screenshot during search*/
        IF [Object not found] THEN
            IF [Check time condition] /*e.g. it's possible to repeat search 3 times with 1000-msec step, for example*/ THEN
                GOTO TimeConditions_label
            ELSE
                EXIT with error code /*moreover, it's possible to send corresponding report to log-file*/
            END IF
        ELSE
            Recalculate actions on a base of new found objects /*e.g. recalculate new coordinates for mouse click*/
        END IF
    END IF
    Produce actions /*it could be changed actions after image processing; moreover, it's possible to finish execution in case of wrong situations during actions*/
END LOOP /*till the end of saved data*/
EXIT
END CCF
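The pseudo-code comments in Table III suggest what the XML tags of a saved scenario might look like. The following sketch reads such a fragment with Python's standard `xml.etree.ElementTree` module. The `Screenshot`, `Sleep`, and `Mouse` tags and their attributes come from the Table III comments; the enclosing `<Scenario>` root element and the helper names are assumptions made here so that the fragment is well-formed.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Hypothetical scenario file content; only the three inner tags appear in Table III.
SCENARIO_XML = """
<Scenario>
  <Screenshot fileName="1.png"/>
  <Sleep duration="2000"/>
  <Mouse action="RightClick" x="100" y="200"/>
</Scenario>
"""

Step = Tuple[str, Dict[str, str]]


def load_steps(xml_text: str) -> List[Step]:
    """Return the scenario as an ordered list of (tag, attributes) pairs."""
    root = ET.fromstring(xml_text.strip())
    return [(element.tag, dict(element.attrib)) for element in root]


def mouse_points(steps: List[Step]) -> List[Tuple[int, int]]:
    """Collect the recorded coordinates of all mouse actions."""
    return [(int(attrs["x"]), int(attrs["y"]))
            for tag, attrs in steps if tag == "Mouse"]


steps = load_steps(SCENARIO_XML)
print(mouse_points(steps))  # [(100, 200)]
```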

Embodiments of the present invention including image analysis and object of activity detection on two images may be illustrated by the following examples using a performance analyzer application program. These figures show applying the process blocks of FIG. 2 to a first image from the recording phase (IR) and a corresponding image from the playback phase (IP). FIG. 3 is an example display of the GUI of an application program captured and saved during a recording phase. This IR screen shot shows that the item “Tuning Activity” was selected by the user using a mouse. FIG. 4 is an example display of the GUI of an application program captured during a playback phase. Note there are some insignificant changes in the displayed windows in comparison to FIG. 3. FIG. 5 is an example image illustrating objects identified during contouring operations of the recording phase according to an embodiment of the present invention as performed on the image of FIG. 3. FIG. 5 shows the sample output from block 200 of FIG. 2. FIG. 6 is an example image illustrating objects of activity of the recording phase according to an embodiment of the present invention as performed on the image of FIG. 5. These contours were identified after performing blocks 202 and 204 of FIG. 2 on the image from FIG. 5. The contour with the text labeled “Tuning” has been determined in this example to be the current object of activity. FIG. 7 is an example image illustrating objects identified during contouring operations of the playback phase according to an embodiment of the present invention. This image is output from performing block 206 of FIG. 2 on the sample image of FIG. 4. Finally, FIG. 8 is an example image illustrating a hypothesis during the playback phase according to an embodiment of the present invention. FIG. 8 shows hypotheses from FIG. 7 for the “Tuning Activity” object of activity from FIG. 6. Size, space, content, and structural filtration of blocks 206-214 have been performed. The ellipse represents the contour which was selected as the best hypothesis from performing block 216 of FIG. 2. A new point for the mouse click is recalculated relative to the given object (i.e., the “tuning” display object).

In some scenarios, filtration according to blocks 208 through 212 still results in many hypotheses to consider. The term hypothesis as used herein means a contour of an object on the playback image which corresponds to a contour of an object on the recorded image at a point in time. This means the previously applied filters didn't reject this correspondence of objects. When the number of hypotheses is large, more computational resources are needed. In one embodiment of the present invention, methods for hypotheses filtration may be used to reduce the number of GUI hypotheses for objects in space (two dimensional (2D) for screen shots and multidimensional in the general case).

These methods comprise a search scheme that is simple yet powerful enough to select the right hypotheses despite the presence of several GUI noise conditions. In embodiments of the present invention, the noise conditions may comprise changeable color schemes, highlighting of items, noise from video devices, anti-aliasing, and other effects. GUI noise has two sources. The first source is the difference in GUI representation from run to run of the CCF system for a given application program. Between runs, for example, there could be different positions of rows in a table, additional controls in a new product version, and so on. These differences result in changes in the playback images in comparison to the recorded images, as shown in the example images of FIGS. 15 and 16. The second source correlates with the points of time when recorded and playback images are captured, especially if the user used a computer system's mouse for the user inputs to the application program under test. Briefly, when a recorded image is captured during a mouse action (such as clicking on the “New” item of FIG. 9), it is correlated with a captured playback image representing the results of the previous user action (such as clicking on the “File” menu header in FIG. 10). During the recording phase, the user moving the mouse from the “File” command to the “New” command activates a highlighting operation that is captured. During the playback phase, the CCF system doesn't move the mouse from the “File” position in the previous step because the position of the “New” command is not yet known (and must still be determined). These differences change the appearance of the playback image in comparison with the corresponding recorded image.

Similar noise conditions are evident in FIGS. 11 and 12. Movement of a mouse results in generation of animation of the “clam” icon in FIG. 11. A similar situation is shown in FIGS. 13 and 14. Movement of the mouse generates the “Resource Perspective” tool tip during the recording phase in FIG. 13. Movement of the mouse from the previous step in the playback phase caused a “VTune™ Performance Tools Perspective” tool tip to be displayed, and the CCF system captured this event in the playback image as shown in FIG. 14.

In one embodiment of the present invention, the Cognitive Control Framework system reduces GUI noise after the recording phase for more effective playback. This embodiment correlates well with changes to the application program during product evolution. If an execution scenario with noisy recorded screen shots is run several times, the CCF system will typically have problems with the search every time. One goal is to modify screen shots from the recording phase using new images from the playback phase and correct the scenario (e.g., point at new coordinates for mouse clicks). After this processing, the playback images will have fewer differences (GUI noise) from the updated screen shots than they had from the initial screen shots of the recording phase. This allows users to select stronger conditions for playback, resulting in better performance and improved testing quality.

For clarity of presentation, let's consider specific real examples of GUI noise during the recording phase. FIG. 9 is an example image illustrating a highlighted object in an image captured during the recording phase according to an embodiment of the present invention. The user clicked the “New” item in the “File” menu. Detailed behavior for a given activity could be the following (let's consider mouse activity, but the use of hot keys is also possible). The user moved the mouse from the “File” menu header to the “New” item and clicked on the “New” item. Highlighting for the selected item is visible on the captured recorded image. On the other hand, a screenshot which could be captured during the playback phase (as shown in FIG. 10) will not show this highlighting because the CCF system is still searching for the corresponding item and the mouse will be situated at another position. This difference in images is a simple example of GUI noise after the recording phase.

The next example shows a more difficult GUI problem. FIG. 11 is an example image illustrating an active object according to an embodiment of the present invention. FIG. 11 presents an animated button which was pressed with the mouse. FIG. 12 is an example image illustrating an inactive object according to an embodiment of the present invention. FIG. 12 shows the usual view of the button when it is not animated.

A similar noise situation is illustrated in FIGS. 13 and 14. FIG. 13 is an example image illustrating a tool tip in a recorded image according to an embodiment of the present invention. FIG. 14 is an example image illustrating a tool tip in a playback image according to an embodiment of the present invention. The recorded image shows a tool tip for a currently selected object but the playback image shows a tool tip for a previously pressed button.

Sometimes GUI changes during product evolution may be considered GUI noise. FIG. 15 is an example image illustrating an index structure in a first version of a GUI according to an embodiment of the present invention. FIG. 16 is an example image illustrating another index structure in a second version of a GUI according to an embodiment of the present invention. For example, FIG. 16 shows an additional item “Quick Copy & Move . . . ” in the dialog and new icons for a new product version, but the old “significant” item still exists there also. In this situation, the new objects could be considered GUI noise relative to the previously recorded image shown in FIG. 15. This situation could be considered evolution of the product, with a requirement to adapt to the new changes.

In some situations there are more complex examples of GUI behavior and more significant differences between the recording phase and the playback phase screenshots. One possible way to solve these problems with a search of objects is to create a new complex algorithm with a huge knowledge base. This is a very difficult task because the software marketplace continually produces new versions of application programs having new controls and different GUI behavior. Embodiments of the present invention comprise a more effective way to solve this problem.

Usually, search conditions for the playback phase are rather strong, especially for scenarios and tests where one of the general aims is to check for a stable product state at the current time. In this situation, the recorded execution scenario could not be run successfully in a noisy environment. It will encounter differences between the recorded and playback screenshots that are significant from the point of view of strong conditions. It's possible to set less strict limits for the search. However, this decreases the potential for verification of bugs and has a very negative effect on the effectiveness of the present invention. If the recorded screenshots correspond to the correct, current images, the conditions can be kept without changes. Thus, the general problem is how to obtain “clear” screenshots with the right data about activities for further use in CCF system processing.

Let us assume a stable application program (or control over the program's execution) and the ability to indicate a valid run of the CCF system for it. The execution scenario could be recorded, “softer” conditions set for playback, and the execution scenario run again in a self-registration mode with control for valid execution. This means that the system captures screenshots and tries to find objects with more adaptive rules for search. The CCF system then saves the screenshot and new information about objects (for example, new coordinates for mouse actions). The CCF system should be able to save additional information from the executed scenario for its integrity. In other words, a new execution scenario should have updates for information correlated to screenshots (it could include coordinates, areas, images and text content, updated dynamic information, etc.). Finally, visual information may be obtained which has better correspondence with the real images during further playbacks. This allows the CCF system to return to “strict” conditions to control application programs for bug verification and other tasks.

Methods for GUI noise reduction could be complex, but this complexity is less than the corresponding complexity for investigation and development of a search scheme in a GUI noise environment. Moreover, knowledge about an application program that is valid for execution could help to automate refreshing of the execution scenario through multiple runs in case of playback faults.

Embodiments of the present invention include methods which make refreshing of the execution scenarios effective. In other embodiments, additional methods may be used to improve GUI noise reduction, leading to easy portability and updates for scenarios between different operating systems and versions during product evolution. An embodiment of the present invention includes a hierarchy of methods (e.g., algorithms, schemas, “strict” conditions, etc.) as shown below.

    1. At first, use “strict” methods. The playback images typically don't have significant differences from the recording images, so it makes sense to use narrow bounds for each parameter, e.g., size of contours, text content, layout, etc.
    2. Use methods that are stable to highlighting, because highlighting is the usual situation for GUI dialogs, menus, etc. The simplest change which could take place is highlighting, and a method which is stable to it may be used to handle it.
    3. Use environment information because this data usually has less noise than a searched active object. The simplest example of this is animation: an active object on the playback image (such as in FIG. 12) is different from one on the recording image (such as in FIG. 11), while the other objects are similar.
    4. Use text information because the only changes may be in the fonts used. A simple change which could take place is another text font, and a method may be used which is stable for handling font changes.
    5. Use additional delays during test execution (but save old test runs) because this helps avoid trouble with early “soft” searching. An additional load on the operating system could affect execution time for the application program, so it's possible to use delays when capturing the screenshot and searching for objects during playback to promote obtaining a good result.
    6. Use a complex search, with the best hypothesis chosen through corresponding methods, e.g., “fuzzy” ones. Sometimes the application program could provide a different view on every run, so the user could provide his or her own complex methods to find active objects. For example, an iterative search procedure may be used.
    7. Divide the execution scenario into modules because this helps refresh and tune the execution scenario over multiple runs. Some steps (or sequences of steps) could be replayed with one type of search parameters and others with different search conditions; it's easier to split a task into sub-tasks and solve them individually.
    8. Use an “as is” playback scheme and/or manual updating for screenshots and recorded information in very difficult situations. Manual correction for the scenario could help, and the playback may be processed with “point-to-point” conditions (e.g., manually select the same point with the mouse during playback as it took place during the recording phase).
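
As a non-limiting illustration, the hierarchy above may be viewed as an ordered list of search methods that are tried from strictest to most permissive until one of them locates the active object. The following Python sketch is only one possible arrangement. The names `find_active_object`, `strict_match`, `highlight_stable_match` and `environment_match` are hypothetical, and the three stub methods stand in for real image-analysis routines that are not specified here.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, int]
# A search method receives a playback image (opaque in this sketch) and
# returns the coordinates of the active object, or None when it fails.
SearchMethod = Callable[[object], Optional[Point]]


def find_active_object(
    image: object,
    methods: Sequence[Tuple[str, SearchMethod]],
) -> Optional[Tuple[str, Point]]:
    """Try each method in order and return the first one that succeeds."""
    for name, method in methods:
        location = method(image)
        if location is not None:
            return name, location
    return None


# Hypothetical stubs: the first two fail, as they might under GUI noise.
def strict_match(image: object) -> Optional[Point]:
    return None


def highlight_stable_match(image: object) -> Optional[Point]:
    return None


def environment_match(image: object) -> Optional[Point]:
    return (120, 48)


hierarchy = [
    ("strict", strict_match),
    ("highlight-stable", highlight_stable_match),
    ("environment", environment_match),
]
result = find_active_object(object(), hierarchy)
```

Keeping the methods in an ordered sequence means that a user-supplied method (item 6) or a manual fallback (item 8) could be appended without changing the search loop.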

In at least one embodiment of the present invention, at a first stage of processing, the recorded image and the execution scenario are obtained for every step of operation of the application program under test. Next, “soft” search conditions may be set. Soft search conditions include a first set of bounds for differences in shapes of contours, text and image content, layout, etc. Then, the execution scenario may be played back automatically using the CCF system. FIG. 18 is an example image illustrating a recorded image according to an embodiment of the present invention. It is a result of a valid execution and soft conditions being applied. For every step of the operation of the application program, a new playback image may be generated. Soft search conditions allow the CCF system to attempt to find the right active object in spite of GUI noise. The recorded and newly generated playback images are saved again and the execution scenario may be modified with new coordinates for mouse (or other user input) activity. Typically, this is the only change made to update an execution scenario. FIG. 19 is an example image illustrating a playback image according to an embodiment of the present invention. In this example, FIG. 19 does not quite match FIG. 18.
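
The per-step update described above may be sketched as follows. This is a minimal illustration under stated assumptions: the `Step` record and the `refresh_step` helper are hypothetical names, images are represented as raw bytes, and the only user input modeled is a mouse click.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Step:
    """One step of an execution scenario (hypothetical layout)."""
    recorded_image: bytes          # screenshot used as the search reference
    click_xy: Tuple[int, int]      # coordinates of the mouse action


def refresh_step(step: Step, playback_image: bytes,
                 found_xy: Tuple[int, int]) -> Step:
    """Return an updated step: the playback image becomes the new recorded
    image, and the coordinates move to where the object was actually found."""
    return replace(step, recorded_image=playback_image, click_xy=found_xy)


old_step = Step(recorded_image=b"recorded", click_xy=(10, 20))
new_step = refresh_step(old_step, b"playback", (12, 21))
```

Returning a new record rather than modifying the old one keeps the original step available, which fits the suggestion elsewhere in the description to save old test runs.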

At a second stage of processing, the search conditions may be set to “stronger” or “stricter” search conditions for applying the execution scenario to verify execution of the application program. Strict search conditions include a second set of bounds for differences in shapes of contours, text and image content, layout, etc. The first set of bounds of differences is different than the second set of bounds of differences. At this time, the recorded images are screenshots obtained from playback processing using “soft” search conditions as in FIG. 19. Thus, FIG. 20 is an example image illustrating another recorded image according to an embodiment of the present invention. This recorded image was copied from the previously generated playback image. For every step of the execution scenario, a new playback image may be generated during playback, such as is shown in FIG. 21. This playback image, in the case of the right workflow of the application program, is similar to the recorded image of FIG. 20. This processing allows the CCF system to better adapt to any inconsistencies in the application program.
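
The two sets of bounds may be pictured as two instances of one data structure. The sketch below is illustrative only: the field names and every numeric value are assumptions chosen for the example, and the measured differences are presumed to come from an image-analysis step that is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchConditions:
    """Upper bounds on the differences tolerated between a recorded object
    and a playback candidate (hypothetical fields and units)."""
    max_contour_diff: float     # fraction of contour shape that may differ
    max_text_diff: float        # fraction of text content that may differ
    max_layout_shift_px: int    # allowed displacement in pixels

    def accepts(self, contour_diff: float, text_diff: float,
                layout_shift_px: int) -> bool:
        """A candidate matches only if every difference is within bounds."""
        return (contour_diff <= self.max_contour_diff
                and text_diff <= self.max_text_diff
                and layout_shift_px <= self.max_layout_shift_px)


# Illustrative values only: the first set is wider than the second.
SOFT = SearchConditions(max_contour_diff=0.30, max_text_diff=0.20,
                        max_layout_shift_px=25)
STRICT = SearchConditions(max_contour_diff=0.05, max_text_diff=0.02,
                          max_layout_shift_px=3)

# A candidate that differs moderately, e.g. because of highlighting.
noisy_candidate = (0.10, 0.0, 8)
soft_ok = SOFT.accepts(*noisy_candidate)
strict_ok = STRICT.accepts(*noisy_candidate)
```

In this example the noisy candidate is accepted under the soft bounds and rejected under the strict bounds, which is the situation the first stage of processing is meant to resolve by refreshing the recorded image.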

FIG. 22 is a flow diagram illustrating GUI noise reduction processing according to an embodiment of the present invention. At block 300, the execution scenario may be recorded. At block 302, soft search conditions may be set for the application program for the same execution scenario. At block 304, the application program may be played back under the control of the CCF system according to the execution scenario with soft conditions for the search. During playback the execution scenario may be updated, including saving new recorded images (taken from generated playback images) and coordinates for user inputs such as mouse selections. At block 306, stronger conditions for the search may then be set for use in subsequent playbacks. At block 308, the application program may be played back again under the control of the CCF system according to the updated execution scenario with stronger search conditions. This process may be repeated until a satisfactory result is achieved. An active object should be found because the application program is known to work correctly during a given playback and the object should exist.
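
The flow of blocks 302 through 308, including the optional repetition, may be sketched as a small loop. This is a simplified illustration, not the CCF system itself: `reduce_gui_noise` and `fake_play_back` are hypothetical names, a scenario is reduced to a dictionary holding a single noise level, and search conditions are reduced to a single tolerance value.

```python
from typing import Callable, Dict, Optional, Tuple

Scenario = Dict[str, float]
# play_back(scenario, tolerance) -> (succeeded, possibly updated scenario)
PlayBack = Callable[[Scenario, float], Tuple[bool, Scenario]]


def reduce_gui_noise(scenario: Scenario, play_back: PlayBack,
                     soft: float, strict: float,
                     max_rounds: int = 3) -> Optional[Scenario]:
    """Refresh a recorded scenario until it passes strict playback."""
    for _ in range(max_rounds):
        # Blocks 302-304: play back with soft conditions; the scenario is
        # updated with new images and coordinates during this playback.
        found, refreshed = play_back(scenario, soft)
        if not found:
            continue  # a real run might add delays or adjust methods here
        # Blocks 306-308: play back the updated scenario, strict conditions.
        passed, _ = play_back(refreshed, strict)
        if passed:
            return refreshed
        scenario = refreshed
    return None


def fake_play_back(scenario: Scenario, tolerance: float
                   ) -> Tuple[bool, Scenario]:
    """Stand-in for real playback: succeeds when the noise is tolerated and
    then returns a refreshed scenario whose images carry no noise."""
    ok = scenario["noise"] <= tolerance
    return ok, ({"noise": 0.0} if ok else scenario)


refreshed = reduce_gui_noise({"noise": 0.2}, fake_play_back,
                             soft=0.3, strict=0.05)
too_noisy = reduce_gui_noise({"noise": 0.5}, fake_play_back,
                             soft=0.3, strict=0.05)
```

The bound on the number of rounds is an assumption added for the sketch so that the loop always terminates; the description itself only states that the process may be repeated until a satisfactory result is achieved.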

An advantage of embodiments of the present invention is that the approach is applicable to any application program exposing a visual interface on any platform and operating system, and is not dependent on a specific API, or architecture of visual system implementation (like Win32 or X-Windows API), or specific operating system. It correlates with an advantage of the overall Cognitive Control Framework approach, which works across platforms. All other known systems are dependent to a small or large extent on system APIs while working with visual elements. A further advantage of this approach is that it is an easy way to clear screenshots of unnecessary user activity effects. It further decreases the problems during playback. Another advantage is that it is an easy way to help with automatic portability of old scenarios to new versions of products. It decreases the time needed to support a baseline of scenarios for application program testing. Another advantage is that it is a scalable way of using new algorithms for solving the execution scenario refresh task.

Reference in the specification to “one embodiment” or “an embodiment” of the present invention means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrase “in one embodiment” appearing in various places throughout the specification are not necessarily all referring to the same embodiment.

Although the operations detailed herein may be described as a sequential process, some of the operations may in fact be performed in parallel or concurrently. In addition, in some embodiments the order of the operations may be rearranged without departing from the scope of the invention.

The techniques described herein are not limited to any particular hardware or software configuration; they may find applicability in any computing or processing environment. The techniques may be implemented in hardware, software, or a combination of the two. The techniques may be implemented in programs executing on programmable machines such as mobile or stationary computers, personal digital assistants, set top boxes, cellular telephones and pagers, and other electronic devices, that each include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and one or more output devices. Program code is applied to the data entered using the input device to perform the functions described and to generate output information. The output information may be applied to one or more output devices. One of ordinary skill in the art may appreciate that the invention can be practiced with various computer system configurations, including multiprocessor systems, minicomputers, mainframe computers, and the like. The invention can also be practiced in distributed computing environments where tasks may be performed by remote processing devices that are linked through a communications network.

Each program may be implemented in a high level procedural or object oriented programming language to communicate with a processing system. However, programs may be implemented in assembly or machine language, if desired. In any case, the language may be compiled or interpreted.

Program instructions may be used to cause a general-purpose or special-purpose processing system that is programmed with the instructions to perform the operations described herein. Alternatively, the operations may be performed by specific hardware components that contain hardwired logic for performing the operations, or by any combination of programmed computer components and custom hardware components. The methods described herein may be provided as a computer program product that may include a machine accessible medium having stored thereon instructions that may be used to program a processing system or other electronic device to perform the methods. The term “machine accessible medium” used herein shall include any medium that is capable of storing or encoding a sequence of instructions for execution by a machine and that causes the machine to perform any one of the methods described herein. The term “machine accessible medium” shall accordingly include, but not be limited to, solid-state memories, and optical and magnetic disks. Furthermore, it is common in the art to speak of software, in one form or another (e.g., program, procedure, process, application, module, logic, and so on) as taking an action or causing a result. Such expressions are merely a shorthand way of stating that the execution of the software by a processing system causes the processor to perform an action or produce a result.

1. A computer-implemented method of reducing graphical user interface (GUI) noise comprising: recording a first execution scenario for control of operation of an application program having a GUI during a recording phase of operation of a cognitive control framework system; setting soft conditions for a search for the application program for the first execution scenario; playing back the application program according to the first execution scenario during a playback phase of operation of the cognitive control framework system; updating the first execution scenario to form a second execution scenario to reduce GUI noise conditions observed during playback, including updating recorded images originally generated by the GUI during the recording phase and updating coordinates for user input data; setting stronger conditions for the search for use in subsequent playbacks; and playing back the application program according to the second execution scenario with the stronger conditions for search.
 2. The method of claim 1, wherein the soft conditions comprise a first set of bounds for differences in shapes of contours, text and image content, or layout.
 3. The method of claim 2, where the stronger conditions comprise a second set of bounds for differences in shapes of contours, text and image content, or layout, the second set being different than the first set.
 4. The method of claim 1, wherein the user input data comprises mouse selections.
 5. The method of claim 1, wherein GUI noise conditions comprise at least one of changeable color schemes, highlighting of items, noise from video sources, and anti-aliasing effects.
 6. The method of claim 1, wherein updating recorded images comprises using playback images as recorded images for subsequent playbacks.
 7. An article comprising: a machine accessible medium containing instructions, which when executed, result in reducing graphical user interface (GUI) noise by recording a first execution scenario for control of operation of an application program having a GUI during a recording phase of operation of a cognitive control framework system; setting soft conditions for a search for the application program for the first execution scenario; playing back the application program according to the first execution scenario during a playback phase of operation of the cognitive control framework system; updating the first execution scenario to form a second execution scenario to reduce GUI noise conditions observed during playback, including updating recorded images originally generated by the GUI during the recording phase and updating coordinates for user input data; setting stronger conditions for the search for use in subsequent playbacks; and playing back the application program according to the second execution scenario with the stronger conditions for search.
 8. The article of claim 7, wherein the soft conditions comprise a first set of bounds for differences in shapes of contours, text and image content, or layout.
 9. The article of claim 8, where the stronger conditions comprise a second set of bounds for differences in shapes of contours, text and image content, or layout, the second set being different than the first set.
 10. The article of claim 7, wherein the user input data comprises mouse selections.
 11. The article of claim 7, wherein GUI noise conditions comprise at least one of changeable color schemes, highlighting of items, noise from video sources, and anti-aliasing effects.
 12. The article of claim 7, wherein instructions to update recorded images comprise instructions to use playback images as recorded images for subsequent playbacks.
 13. A method of automatically controlling execution of an application program having a GUI to reduce GUI noise comprising: capturing user input data and images displayed by the GUI during a recording phase of execution of the application program; analyzing the captured user input data and recorded images to generate a first execution scenario during the recording phase; setting soft conditions for a search for the application program for the first execution scenario; generating simulated user input data based on the first execution scenario during a playback phase of execution of the application program and inputting the simulated user input data to the application program; performing image analysis on playback images displayed by the GUI as a result of processing the simulated user input data during the playback phase and the recorded images; updating the first execution scenario to form a second execution scenario to reduce GUI noise conditions observed during playback, including updating the recorded images originally generated by the GUI during the recording phase and updating coordinates for user input data; setting stronger conditions for the search for use in subsequent playbacks; and playing back the application program according to the second execution scenario with the stronger conditions for search.
 14. The method of claim 13, wherein the soft conditions comprise a first set of bounds for differences in shapes of contours, text and image content, or layout.
 15. The method of claim 14, where the stronger conditions comprise a second set of bounds for differences in shapes of contours, text and image content, or layout, the second set being different than the first set.
 16. The method of claim 14, wherein GUI noise conditions comprise at least one of changeable color schemes, highlighting of items, noise from video sources, and anti-aliasing effects.
 17. The method of claim 14, wherein updating recorded images comprises using playback images as recorded images for subsequent playbacks.
 18. An article comprising: a machine accessible medium containing instructions, which when executed, result in automatically controlling execution of an application program having a GUI to reduce GUI noise by capturing user input data and images displayed by the GUI during a recording phase of execution of the application program; analyzing the captured user input data and recorded images to generate a first execution scenario during the recording phase; setting soft conditions for a search for the application program for the first execution scenario; generating simulated user input data based on the first execution scenario during a playback phase of execution of the application program and inputting the simulated user input data to the application program; performing image analysis on playback images displayed by the GUI as a result of processing the simulated user input data during the playback phase and the recorded images; updating the first execution scenario to form a second execution scenario to reduce GUI noise conditions observed during playback, including updating the recorded images originally generated by the GUI during the recording phase and updating coordinates for user input data; setting stronger conditions for the search for use in subsequent playbacks; and playing back the application program according to the second execution scenario with the stronger conditions for search.
 19. The article of claim 18, wherein the soft conditions comprise a first set of bounds for differences in shapes of contours, text and image content, or layout.
 20. The article of claim 19, where the stronger conditions comprise a second set of bounds for differences in shapes of contours, text and image content, or layout, the second set being different than the first set.
 21. The article of claim 18, wherein GUI noise conditions comprise at least one of changeable color schemes, highlighting of items, noise from video sources, and anti-aliasing effects.
 22. The article of claim 18, wherein instructions to update recorded images comprise instructions to use playback images as recorded images for subsequent playbacks.